Smart Paper Notes

Multimodal contrastive learning for remote sensing tasks

Umangi Jain , Alex Wilson , Varun Gulshan

Categories: Computer Vision | Machine Learning

2022-09-06

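Self-supervised methods have shown tremendous success in computer vision, including applications in remote sensing and medical imaging. The most popular contrastive methods, such as SimCLR, MoCo, and MoCo-v2, use multiple views of the same image: they apply contrived augmentations to the image to create positive pairs and contrast them with negative examples. Although these techniques work well, most of them have been tuned on ImageNet (and similar computer vision datasets). While there have been some attempts to capture a richer set of deformations in the positive samples, in this work we explore a promising alternative for generating positive examples for remote sensing data within the contrastive learning framework. Images captured by different sensors at the same location can be regarded as strongly augmented instances of the same scene, which removes the need to explore and tune a set of hand-crafted strong augmentations. In this paper, we propose a simple dual-encoder framework that is pre-trained on a large unlabeled dataset (~1M) of Sentinel-1 and Sentinel-2 image pairs. We test the embeddings on two remote sensing downstream tasks, flood segmentation and land cover mapping, and show empirically that embeddings learned with this technique outperform the conventional technique of collecting positive examples via aggressive data augmentation.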
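As a rough illustration of the kind of contrastive objective a dual-encoder setup like this could use, the sketch below computes a symmetric InfoNCE loss over a batch of paired embeddings in plain Python. The abstract does not state the exact loss, temperature, or encoder architecture, so the function name `symmetric_info_nce`, the temperature of 0.1, and the toy embeddings are all assumptions made for illustration.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_row(logits, target):
    # Numerically stable -log softmax(logits)[target].
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def symmetric_info_nce(s1_emb, s2_emb, temperature=0.1):
    """s1_emb[i] and s2_emb[i] are embeddings of the same location
    produced by two sensor-specific encoders (hypothetical inputs)."""
    a = [l2_normalize(v) for v in s1_emb]
    b = [l2_normalize(v) for v in s2_emb]
    n = len(a)
    # Cosine similarity of every cross-sensor pair, scaled by temperature.
    sim = [[sum(x * y for x, y in zip(a[i], b[j])) / temperature
            for j in range(n)] for i in range(n)]
    # Matching indices are positives; all other pairs in the batch are negatives.
    loss_ab = sum(cross_entropy_row(sim[i], i) for i in range(n)) / n
    loss_ba = sum(cross_entropy_row([sim[i][j] for i in range(n)], j)
                  for j in range(n)) / n
    return 0.5 * (loss_ab + loss_ba)

# Toy check: correctly paired embeddings should give a lower loss
# than the same embeddings with the pairing swapped.
aligned = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]])
shuffled = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]])
print(aligned < shuffled)
```

In this sketch no hand-crafted augmentation appears anywhere: the only source of positive pairs is that row `i` of both inputs refers to the same scene.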

Translated by Google Translate

Predicting the success of Gradient Descent for a particular Dataset-Architecture-Initialization (DAI)

Umangi Jain , Harish G. Ramaswamy

Categories: Machine Learning

2021-11-25

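Despite their massive success, training successful deep neural networks still relies largely on experimentally choosing an architecture, hyperparameters, initialization, and training mechanism. In this work, we focus on determining the success of the standard gradient descent method for training deep neural networks on a specified dataset, architecture, and initialization (DAI) combination. Through extensive systematic experiments, we show that the evolution of the singular values of the matrix obtained from the hidden layers of a DNN can help determine the success of gradient descent, even in the absence of validation labels in the supervised learning paradigm. This phenomenon can facilitate early give-up: stopping, early in the training process, the training of neural networks that are predicted not to generalize well. Our experiments across multiple datasets, architectures, and initializations show that the proposed scores predict the success of a DAI more accurately than relying only on validation accuracy at early epochs.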


Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents

Sayar Ghosh Roy , Anshul Padhi , Risubh Jain , Manish Gupta , Vasudeva Varma

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-31

Multiple studies have focused on predicting the prospective popularity of an online document as a whole, without paying attention to the contributions of its individual parts. We introduce the task of proactively forecasting popularities of sentences within online news documents solely utilizing their natural language content. We model sentence-specific popularity forecasting as a sequence regression task. For training our models, we curate InfoPop, the first dataset containing popularity labels for over 1.7 million sentences from over 50,000 online news documents. To the best of our knowledge, this is the first dataset automatically created using streams of incoming search engine queries to generate sentence-level popularity annotations. We propose a novel transfer learning approach involving sentence salience prediction as an auxiliary task. Our proposed technique coupled with a BERT-based neural model exceeds nDCG values of 0.8 for proactive sentence-specific popularity forecasting. Notably, our study presents a non-trivial takeaway: though popularity and salience are different concepts, transfer learning from salience prediction enhances popularity forecasting. We release InfoPop and make our code publicly available: https://github.com/sayarghoshroy/InfoPopularity
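For readers unfamiliar with the nDCG metric quoted above, here is a minimal sketch of how it can be computed for one document's sentences. It assumes the common linear-gain form with a log2 position discount; the abstract does not say which nDCG variant or cutoff the authors use, and the function names and scores are illustrative only.

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: item at 0-based rank r is divided by log2(r + 2).
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(true_scores, predicted_scores):
    """Rank items by predicted score, then compare the DCG of that ranking
    with the DCG of the ideal ranking (sorted by true score)."""
    order = sorted(range(len(true_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_scores[i] for i in order]
    ideal_dcg = dcg(sorted(true_scores, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical popularity labels for three sentences and two sets of predictions.
perfect = ndcg([3.0, 2.0, 1.0], [0.9, 0.5, 0.1])
reversed_order = ndcg([3.0, 2.0, 1.0], [0.1, 0.5, 0.9])
print(perfect, reversed_order)
```

A value of 1.0 means the predicted ordering matches the ideal ordering exactly; any misordering of sentences with different true popularity pushes the value below 1.0.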


An Investigation of Indian Native Language Phonemic Influences on L2 English Pronunciations

Shelly Jain , Priyanshi Pal , Anil Vuppala , Prasanta Ghosh , Chiranjeevi Yarra

Categories: Natural Language Processing

2022-12-19

Speech systems are sensitive to accent variations. This is especially challenging in the Indian context, with an abundance of languages but a dearth of linguistic studies characterising pronunciation variations. The growing number of L2 English speakers in India reinforces the need to study accents and L1-L2 interactions. We investigate the accents of Indian English (IE) speakers and report in detail our observations, both specific and common to all regions. In particular, we observe the phonemic variations and phonotactics occurring in the speakers' native languages and apply this to their English pronunciations. We demonstrate the influence of 18 Indian languages on IE by comparing the native language pronunciations with IE pronunciations obtained jointly from existing literature studies and phonetically annotated speech of 80 speakers. Consequently, we are able to validate the intuitions of Indian language influences on IE pronunciations by justifying pronunciation rules from the perspective of Indian language phonology. We obtain a comprehensive description in terms of universal and region-specific characteristics of IE, which facilitates accent conversion and adaptation of existing ASR and TTS systems to different Indian accents.


Hybrid Quantum Generative Adversarial Networks for Molecular Simulation and Drug Discovery

Prateek Jain , Srinjoy Ganguly

Categories: Machine Learning

2022-12-15

In molecular research, the simulation and design of molecules are key areas with significant implications for drug development, materials science, and other fields. Current classical computational power is inadequate to simulate anything more than small molecules, let alone protein chains of hundreds of peptides. These experiments are therefore done physically in wet labs, but this takes a lot of time, and it is not possible to examine every molecule because of the size of the search space; tens of billions of dollars are spent every year on such experiments. Molecule simulation and design has lately advanced significantly thanks to machine learning models. Deep generative models for graph-structured data provide a fresh perspective on the problem of chemical synthesis. By optimising differentiable models that produce molecular graphs directly, it is feasible to avoid costly search techniques in the huge, discrete space of chemical structures. However, these models also suffer from computational limitations when dimensions become large, and they consume a huge amount of resources. In recent years, quantum generative machine learning has shown some empirical results that promise significant advantages over classical counterparts.


Child PalmID: Contactless Palmprint Recognition

Anil K. Jain , Akash Godbole , Anjoo Bhatnagar , Prem Sewak Sudhish

Categories: Computer Vision

2022-12-14

Developing and least developed countries face the dire challenge of ensuring that each child in their country receives the required doses of vaccination, adequate nutrition, and proper medication. International agencies such as UNICEF, WHO, and WFP, among other organizations, strive to find innovative solutions to determine which children have received the benefits and which have not. Biometric recognition systems have been sought out to help solve this problem. To that end, this report establishes a baseline accuracy of a commercial contactless palmprint recognition system that may be deployed for recognizing children in the age group of one to five years old. On a database of contactless palmprint images of one thousand unique palms from 500 children, we establish SOTA authentication accuracy of 90.85% @ FAR of 0.01%, rank-1 identification accuracy of 99.0% (closed set), and FPIR=0.01 @ FNIR=0.3 for open-set identification using the PalmMobile SDK from Armatura.
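The "accuracy @ FAR" figure above is a true accept rate measured at a fixed false accept rate. The sketch below shows one common way such a number can be derived from lists of genuine and impostor comparison scores. It is not the PalmMobile SDK's procedure, which the abstract does not describe; the function name `tar_at_far` and the scores are hypothetical.

```python
def tar_at_far(genuine, impostor, far):
    """True accept rate at a false accept rate of at most `far`.
    Higher scores are assumed to mean a better match."""
    ranked = sorted(impostor, reverse=True)
    # Number of impostor comparisons we are allowed to accept by mistake.
    allowed = int(far * len(ranked))
    if allowed >= len(ranked):
        threshold = float("-inf")
    else:
        # Accept only scores strictly above this impostor score, so at most
        # `allowed` impostor comparisons pass.
        threshold = ranked[allowed]
    accepted = sum(1 for s in genuine if s > threshold)
    return accepted / len(genuine)

# Hypothetical match scores, for illustration only.
genuine_scores = [0.35, 0.5, 0.2, 0.9]
impostor_scores = [0.1, 0.2, 0.3, 0.4]
rate = tar_at_far(genuine_scores, impostor_scores, far=0.25)
print(rate)
```

Tightening `far` raises the threshold, which is why verification accuracy is always quoted together with the FAR at which it was measured.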


Selective classification using a robust meta-learning approach

Nishant Jain , Pradeep Shenoy

Categories: Machine Learning

2022-12-12

Selective classification involves identifying the subset of test samples that a model can classify with high accuracy, and is important for applications such as automated medical diagnosis. We argue that this capability of identifying uncertain samples is valuable for training classifiers as well, with the aim of building more accurate classifiers. We unify these dual roles by training a single auxiliary meta-network to output an importance weight as a function of the instance. This measure is used at train time to reweight training data, and at test time to rank test instances for selective classification. A second, key component of our proposal is the meta-objective of minimizing dropout variance (the variance of classifier output when subjected to random weight dropout) for training the meta-network. We train the classifier together with its meta-network using a nested objective of minimizing classifier loss on training data and meta-loss on a separate meta-training dataset. We outperform the current state of the art on selective classification by substantial margins, for instance by up to 1.9% AUC and 2% accuracy on a real-world diabetic retinopathy dataset. Finally, our meta-learning framework extends naturally to unsupervised domain adaptation, given our unsupervised variance minimization meta-objective. We show cumulative absolute gains of 3.4% / 3.3% accuracy and AUC over the other baselines in domain shift settings on the Retinopathy dataset using unsupervised domain adaptation.
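To make the notion of dropout variance concrete, the following sketch estimates it for a toy logistic classifier by sampling random weight-dropout masks, then uses it directly to pick the most certain test instances. This is a simplification: in the abstract, instances are ranked by the output of a learned meta-network, whereas here the variance itself is used as a stand-in score. The model, keep probability, sample count, and all names are assumptions.

```python
import math
import random

def predict(weights, x, keep_mask, keep_prob):
    # Logistic output with inverted dropout applied to the weights.
    z = sum(w * m / keep_prob * xi for w, m, xi in zip(weights, keep_mask, x))
    return 1.0 / (1.0 + math.exp(-z))

def dropout_variance(weights, x, n_samples=200, keep_prob=0.8, rng=None):
    """Variance of the classifier output over random weight-dropout masks."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    outputs = []
    for _ in range(n_samples):
        mask = [1.0 if rng.random() < keep_prob else 0.0 for _ in weights]
        outputs.append(predict(weights, x, mask, keep_prob))
    mean = sum(outputs) / len(outputs)
    return sum((o - mean) ** 2 for o in outputs) / len(outputs)

def select_most_certain(weights, inputs, coverage=0.5):
    # Keep the fraction `coverage` of inputs with the lowest dropout variance.
    ranked = sorted(inputs, key=lambda x: dropout_variance(weights, x))
    keep = max(1, int(len(ranked) * coverage))
    return ranked[:keep]

toy_weights = [2.0, 2.0, 2.0]
stable_input = [0.0, 0.0, 0.0]     # output does not depend on the mask
unstable_input = [1.0, -1.0, 1.0]  # output changes with each mask
kept = select_most_certain(toy_weights, [unstable_input, stable_input])
print(kept)
```

The same per-instance score serves both roles named in the abstract: it can down-weight unstable training examples and it can order test examples for abstention.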


Learning on non-stationary data with re-weighting

Nishant Jain , Pradeep Shenoy

Categories: Machine Learning

2022-12-12

Many real-world learning scenarios face the challenge of slow concept drift, where data distributions change gradually over time. In this setting, we pose the problem of learning temporally sensitive importance weights for training data, in order to optimize predictive accuracy. We propose a class of temporal reweighting functions that can capture multiple timescales of change in the data, as well as instance-specific characteristics. We formulate a bi-level optimization criterion, and an associated meta-learning algorithm, by which these weights can be learned. In particular, our formulation trains an auxiliary network to output weights as a function of training instances, thereby compactly representing the instance weights. We validate our temporal reweighting scheme on a large real-world dataset of 39M images spread over a 9-year period. Our extensive experiments demonstrate the necessity of instance-based temporal reweighting in the dataset, and achieve significant improvements over classical batch-learning approaches. Further, our proposal easily generalizes to a streaming setting and shows significant gains compared to recent continual learning methods.
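One simple way to picture a reweighting function that covers several timescales is a mixture of exponential decays over the age of each training example, sketched below. The abstract does not give the functional form the authors use, and in their method the weights are produced by a learned auxiliary network rather than fixed by hand, so the mixture form, the timescales, and the mixing coefficients here are purely illustrative assumptions.

```python
import math

def temporal_weight(age, timescales, mix):
    """Weight of an example that is `age` time units old, as a mixture of
    exponential decays with one term per timescale (hypothetical form)."""
    return sum(m * math.exp(-age / tau) for m, tau in zip(mix, timescales))

def weighted_mean_loss(losses, ages, timescales, mix):
    # Training objective in which recent examples count more than old ones.
    weights = [temporal_weight(a, timescales, mix) for a in ages]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)

timescales = [1.0, 10.0]  # a fast and a slow decay, in arbitrary time units
mix = [0.5, 0.5]          # mixing coefficients; these would be learned
fresh = temporal_weight(0.0, timescales, mix)
old = temporal_weight(5.0, timescales, mix)
loss = weighted_mean_loss([1.0, 3.0], [0.0, 5.0], timescales, mix)
print(fresh, old, loss)
```

With two timescales, the fast term fades quickly while the slow term keeps older data from being discarded entirely, which is the kind of behaviour a single decay rate cannot express.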


Structured information extraction from complex scientific text with fine-tuned large language models

Alexander Dunn , John Dagdelen , Nicholas Walker , Sanghoon Lee , Andrew S. Rosen , Gerbrand Ceder , Kristin Persson , Anubhav Jain

Categories: Natural Language Processing

2022-12-10

Intelligently extracting and linking complex scientific information from unstructured text is a challenging endeavor, particularly for those inexperienced with natural language processing. Here, we present a simple sequence-to-sequence approach to joint named entity recognition and relation extraction for complex hierarchical information in scientific text. The approach leverages a pre-trained large language model (LLM), GPT-3, that is fine-tuned on approximately 500 pairs of prompts (inputs) and completions (outputs). Information is extracted either from single sentences or across sentences in abstracts/passages, and the output can be returned as simple English sentences or in a more structured format, such as a list of JSON objects. We demonstrate that LLMs trained in this way are capable of accurately extracting useful records of complex scientific knowledge for three representative tasks in materials chemistry: linking dopants with their host materials, cataloging metal-organic frameworks, and general chemistry/phase/morphology/application information extraction. This approach represents a simple, accessible, and highly flexible route to obtaining large databases of structured knowledge extracted from unstructured text. An online demo is available at http://www.matscholar.com/info-extraction.
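When a model returns its extraction as a list of JSON objects, the downstream step is ordinary parsing and validation. The sketch below shows that step for the dopant-host task under an invented schema: the keys `host` and `dopants`, the helper names, and the example completion string are all hypothetical and are not taken from the paper.

```python
import json

def parse_completion(completion):
    """Parse a model completion that is expected to be a JSON list of objects.
    Returns an empty list when the text is not valid JSON or has another shape."""
    try:
        records = json.loads(completion)
    except json.JSONDecodeError:
        return []
    if not isinstance(records, list):
        return []
    return [r for r in records if isinstance(r, dict)]

def dopant_host_pairs(records):
    # Flatten records into (host, dopant) pairs, skipping incomplete records.
    return [(r["host"], d)
            for r in records if "host" in r
            for d in r.get("dopants", [])]

# Hypothetical completion text, written by hand for this example.
completion = '[{"host": "ZnO", "dopants": ["Al", "Ga"]}]'
pairs = dopant_host_pairs(parse_completion(completion))
print(pairs)
```

Treating malformed output as an empty result, rather than raising, keeps a batch extraction run going when an occasional completion is not well-formed JSON.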


Task-Directed Exploration in Continuous POMDPs for Robotic Manipulation of Articulated Objects

Aidan Curtis , Leslie Kaelbling , Siddarth Jain

Categories: Robotics

2022-12-08

Representing and reasoning about uncertainty is crucial for autonomous agents acting in partially observable environments with noisy sensors. Partially observable Markov decision processes (POMDPs) serve as a general framework for representing problems in which uncertainty is an important factor. Online sample-based POMDP methods have emerged as efficient approaches to solving large POMDPs and have been shown to extend to continuous domains. However, these solutions struggle to find long-horizon plans in problems with significant uncertainty. Exploration heuristics can help guide planning, but many real-world settings contain significant task-irrelevant uncertainty that might distract from the task objective. In this paper, we propose STRUG, an online POMDP solver capable of handling domains that require long-horizon planning with significant task-relevant and task-irrelevant uncertainty. We demonstrate our solution on several temporally extended versions of toy POMDP problems as well as robotic manipulation of articulated objects using a neural perception frontend to construct a distribution of possible models. Our results show that STRUG outperforms the current sample-based online POMDP solvers on several tasks.
